THE REACHABILITY PROBLEM FOR AFFINE FUNCTIONS ON THE INTEGERS

DANIEL FREMONT 

Abstract. We consider the problem of determining, given $x, y \in \mathbb{Z}^k$ and a finite set $F$ of affine functions on $\mathbb{Z}^k$, whether $y$ is reachable from $x$ by applying the functions $F$. We also consider the analogous problem over $\mathbb{N}^k$. These problems are known to be undecidable for $k \ge 2$. We give 2-EXPTIME algorithms for both problems in the remaining case $k = 1$. The exact complexities remain open, although we show a simple NP lower bound.

1. Introduction

Many dynamical systems with simple evolution rules nevertheless exhibit unpredictable long-term behavior. Multidimensional systems in particular can easily be so complex that questions like state reachability are undecidable. For instance, this is the case for states in $[0,1]^2$ evolving under a piecewise-linear function pQ. There are even simpler nondeterministic examples, such as states in $\mathbb{Q}^2$ under a finite set of affine functions [2]. For both systems, having states with two coordinates whose evolution is not independent is essential to the undecidability proofs. Generally, while it is often not difficult to prove undecidability for systems with sufficiently high dimension, determining if and when the transition to decidability occurs at lower dimensions is harder. In particular, it is not known whether the reachability problem is decidable for nondeterministic affine evolution on $\mathbb{Q}$.

In this paper, we consider the simpler problem of reachability under nondeterministic affine evolution on $\mathbb{Z}$: given $x, y \in \mathbb{Z}$ and a finite set $F$ of affine functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$, determine whether $y$ is reachable from $x$ by applying functions in $F$. The generalizations to $\mathbb{Z}^n$ are undecidable for all $n \ge 2$, as implicitly shown in [3] (and a little more clearly in [1, Section 4.9]). We prove that the remaining case, $n = 1$, is decidable, giving a 2-EXPTIME algorithm for it in Section 2. We also consider the version of this problem with evolution over $\mathbb{N}$: in this problem, the functions $f_i$ can still have negative coefficients, but may not be applied if they would yield a negative result. In Section 3 we give a 2-EXPTIME algorithm for this problem by modifying our algorithm for the case over $\mathbb{Z}$. Finally, in Section 4 we show that the problems over $\mathbb{Z}$ and $\mathbb{N}$ are both NP-hard.

2. Affine reachability over Z

We begin by defining some notation that we will use throughout this paper.

Definition. If $S$ is a set of affine functions on $\mathbb{Z}$, we will call any function of the form $G = s_K \circ \cdots \circ s_1$ for some $K \in \mathbb{N}$ with each $s_i \in S$ an $S$-composition. We will want to discuss the individual functions $s_i$ which appear in an $S$-composition, so $G$ is formally the tuple $(s_1, \ldots, s_K)$, but we will often view $S$-compositions as functions without further comment. The orbit of $G$ applied to the argument $x$ is the set of values $\{x, s_1(x), (s_2 \circ s_1)(x), \ldots, (s_K \circ \cdots \circ s_1)(x)\}$. We write $x \xrightarrow{S} y$ via $G$ to indicate that $G$ is an $S$-composition such that $G(x) = y$, and $x \xrightarrow{S} y$ to assert the existence of such a $G$.

In this notation, the affine reachability problem over $\mathbb{Z}$ is to determine, given $x, y \in \mathbb{Z}$ and a finite set $F$ of affine functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$, whether $x \xrightarrow{F} y$. The problem takes several qualitatively different forms depending on the values of the linear coefficients $a_i$. The simplest nontrivial case is when they all satisfy $|a_i| > 1$.

Lemma 1. There is an EXPTIME algorithm to decide, given any $x, y \in \mathbb{Z}$ and a finite set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and satisfying $|a_i| > 1$, whether $x \xrightarrow{F} y$.

Proof. Outside of some finite interval, for instance $[-Q, Q]$ with $Q = 1 + \max |b_i|$, each function $f_i$ strictly increases absolute value. Putting $R = \max\{Q, |y|\}$, for any $F$-composition $G$ and $z \in \mathbb{Z}$ with $|z| > R$ we have $|G(z)| \ge |z| > |y|$ and thus $G(z) \ne y$. This means that all preimages of $y$ under $F$-compositions must lie in the finite interval $I = [-R, R]$. Create a directed graph $D$ with a vertex for each integer $z \in I$, and add edges from $z$ to $f_i(z)$ for each $i$ satisfying $f_i(z) \in I$. Since every preimage of $y$ under an $F$-composition lies in $I$, we have $x \xrightarrow{F} y$ if and only if $x \in I$ and there is a path in $D$ from $x$ to $y$. We can determine whether such a path exists in exponential time using graph search, since $D$ has exponentially-many vertices (at most linearly-many in the values of $b_i$ and $y$). □
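The search in this proof can be sketched in Python as follows. This is only an illustration under our own conventions: the name `reachable_expanding` and the encoding of each $f_i$ as a pair `(a, b)` are assumptions made for the example, and the graph $D$ is explored implicitly by breadth-first search rather than built up front.

```python
from collections import deque


def reachable_expanding(x, y, funcs):
    """Decide whether y is reachable from x when every |a_i| > 1.

    funcs is a list of pairs (a, b) standing for z -> a*z + b
    (an encoding assumed here for illustration).
    """
    if any(abs(a) < 2 for a, _ in funcs):
        raise ValueError("every linear coefficient must satisfy |a| > 1")
    # Q = 1 + max|b_i| and R = max(Q, |y|), as in the proof.
    q_bound = 1 + max((abs(b) for _, b in funcs), default=0)
    r_bound = max(q_bound, abs(y))
    if abs(x) > r_bound:
        return False  # x lies outside I = [-R, R], so it is no preimage of y
    seen = {x}
    queue = deque([x])
    while queue:
        z = queue.popleft()
        if z == y:
            return True
        for a, b in funcs:
            w = a * z + b
            # Only follow edges that stay inside I.
            if abs(w) <= r_bound and w not in seen:
                seen.add(w)
                queue.append(w)
    return False
```

For example, with the two functions $z \mapsto 2z + 1$ and $z \mapsto 3z$, the value 7 is reachable from 1 (via $1 \mapsto 3 \mapsto 7$), while 5 is not.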

Remark. As will be important later, with a small modification of this algorithm we can handle the presence of one $f_j$ with $a_j = -1$, so that $f_j(z) = -z + b_j$. The only change necessary is to broaden $I$ to the interval $I' = [\min(-R, -R + b_j), \max(R, R + b_j)]$. Then $f_j^{-1} = f_j$ maps $I'$ onto itself, so all preimages of $y$ under $F$-compositions are in $I'$, and the argument goes through as above.

However, when a function in $F$ is of the form $g(z) = z + k$ this method breaks down, because the preimages of $y$ under $F$-compositions are no longer bounded. This also happens if there are two functions of the form $f(z) = -z + b$, since then their composition is of the form $g(z) = z + k$. Fortunately, functions like $g$ can contribute to an $F$-composition in basically only one way.

Lemma 2. For any set $S$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ for $1 \le i \le N$ and function $f_0(z) = z + k$, put $F = S \cup \{f_0\}$. Then for any $F$-composition $G$, we have:

(a) If $G = f_{e_0} \circ \cdots \circ f_{e_j} \circ f_0^n \circ f_{e_{j+1}} \circ \cdots \circ f_{e_K}$, then $G(z) = H(z) + \alpha n k$ where $H = f_{e_0} \circ \cdots \circ f_{e_K}$ and $\alpha = a_{e_0} \cdots a_{e_j}$.
(b) $G(z) = H(z) + \alpha k$ for some $S$-composition $H$ and $\alpha \in \mathbb{Z}$. If $a_i > 0$ for every function $f_i$ appearing in $G$, then $\alpha \ge 0$.

Proof. (a) We have $(f_i \circ f_0^n)(z) = a_i(z + nk) + b_i = (a_i z + b_i) + a_i n k = (f_0^{a_i n} \circ f_i)(z)$, so

$G = f_{e_0} \circ \cdots \circ f_{e_j} \circ f_0^n \circ f_{e_{j+1}} \circ \cdots \circ f_{e_K} = f_{e_0} \circ \cdots \circ f_{e_{j-1}} \circ f_0^{a_{e_j} n} \circ f_{e_j} \circ \cdots \circ f_{e_K}.$

Repeating this $j$ more times gives $G = f_0^{\alpha n} \circ f_{e_0} \circ \cdots \circ f_{e_K}$, that is, $G(z) = H(z) + \alpha n k$ with $H = f_{e_0} \circ \cdots \circ f_{e_K}$ and $\alpha = a_{e_0} \cdots a_{e_j}$.
(b) Apply the previous result once for each instance of $f_0$ in $G$, giving $G = f_0^{c_0} \circ \cdots \circ f_0^{c_L} \circ f_{e_0} \circ \cdots \circ f_{e_K}$ for some $c_0, \ldots, c_L \in \mathbb{Z}$ which are products of the coefficients $a_i$. Then $G(z) = H(z) + \alpha k$ with $H$ the $S$-composition $H = f_{e_0} \circ \cdots \circ f_{e_K}$ and $\alpha = c_0 + \cdots + c_L$. If $a_i > 0$ for each $f_i$ appearing in $G$, then each $c_i > 0$ and so $\alpha \ge 0$ (we could have $\alpha = 0$ if there were no instances of $f_0$ in $G$). □
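The commutation identity of part (a) is easy to check numerically. The following Python sketch does so for concrete inputs; the helper names `apply_all` and `shift_identity_holds`, and the pair encoding `(a, b)` for $z \mapsto az + b$, are assumptions made for this illustration only. Functions are listed in the order they are applied, so `before` holds $f_{e_K}, \ldots, f_{e_{j+1}}$ and `after` holds $f_{e_j}, \ldots, f_{e_0}$.

```python
def apply_all(z, funcs):
    """Apply the affine maps in funcs to z, first list element first."""
    for a, b in funcs:
        z = a * z + b
    return z


def shift_identity_holds(z, before, after, n, k):
    """Check G(z) == H(z) + alpha*n*k for G = after . f0^n . before.

    f0(z) = z + k is applied n times between the two blocks; alpha is the
    product of the linear coefficients of the maps applied after f0^n.
    """
    shift_block = [(1, k)] * n
    g_value = apply_all(z, before + shift_block + after)
    alpha = 1
    for a, _ in after:
        alpha *= a
    h_value = apply_all(z, before + after)  # H: G with the f0 block removed
    return g_value == h_value + alpha * n * k
```

Since the identity is exact, the check succeeds for every choice of integers, including negative coefficients and negative $k$.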

We now have several cases, based on which $S$-compositions $G$ satisfy $x \xrightarrow{S} y \pmod{k}$ via $G$.

Lemma 3. For any $x, y \in \mathbb{Z}$, finite set $S$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and $a_i \ne 0$, and function $g(z) = z + k$ with $k \in \mathbb{Z}$ and $k \ne 0$, put $F = S \cup \{g\}$ and $\mathcal{G} = \{G : x \xrightarrow{S} y \pmod{k} \text{ via } G\}$. Then the following are true:

(A) If $\mathcal{G} = \emptyset$, then $x \not\xrightarrow{F} y$.

(B) If some $G \in \mathcal{G}$ satisfies $\operatorname{sgn}(G(x) - y) \ne \operatorname{sgn}(k)$, then $x \xrightarrow{F} y$.

(C) If some $G \in \mathcal{G}$ with $G = f_{e_0} \circ \cdots \circ f_{e_K}$ has $a_{e_j} < 0$ for some $j$, then $x \xrightarrow{F} y$.

(D) If none of the above cases hold, then $x \not\xrightarrow{F} y$.

Proof. (A) By Lemma 2(b), any $F$-composition can be written as an $S$-composition plus a multiple of $k$. If $\mathcal{G} = \emptyset$, then no $S$-composition can reach $y \pmod{k}$ from $x$, and therefore neither can any $F$-composition.

(B) If some $G \in \mathcal{G}$ satisfies $\operatorname{sgn}(G(x) - y) \ne \operatorname{sgn}(k)$, then either $\operatorname{sgn}(G(x) - y) = 0$ or $\operatorname{sgn}(G(x) - y) = -\operatorname{sgn}(k)$. If the first of these is true, then $G(x) = y$ and so $x \xrightarrow{F} y$ via $G$. Otherwise, there is some $n \in \mathbb{N}$ such that $G(x) - y = -nk$, so $y = G(x) + nk = (g^n \circ G)(x)$. Putting $G' = g^n \circ G$, we have $x \xrightarrow{F} y$ via $G'$.

(C) If some $G \in \mathcal{G}$ with $G = f_{e_0} \circ \cdots \circ f_{e_K}$ has $a_{e_j} < 0$ for some $j$, take the smallest such $j$. Defining $G' = f_{e_0} \circ \cdots \circ f_{e_j} \circ g^n \circ f_{e_{j+1}} \circ \cdots \circ f_{e_K}$, by Lemma 2(a) we have $G'(z) = G(z) + \alpha n k$ with $\alpha = a_{e_0} a_{e_1} \cdots a_{e_j}$, and because $a_{e_i} > 0$ for all $i < j$ by our choice of $j$, we have $\alpha < 0$. We may assume that case (B) does not hold, since otherwise we have $x \xrightarrow{F} y$ immediately as shown above. Then we have $\operatorname{sgn}(G(x) - y) = \operatorname{sgn}(k)$, and so $\operatorname{sgn}(G(x) - y) = -\operatorname{sgn}(\alpha k)$. Therefore with $n$ sufficiently large we have $\operatorname{sgn}(G'(x) - y) = \operatorname{sgn}(G(x) - y + \alpha n k) = -\operatorname{sgn}(G(x) - y) = -\operatorname{sgn}(k)$. Then as in case (B) we have $x \xrightarrow{F} y$ via $G'' = g^m \circ G'$ for some $m \in \mathbb{N}$.

(D) Suppose that $x \xrightarrow{F} y$ via some $H$. Then by Lemma 2(b) we have $H = g^{\alpha} \circ G$ for some $S$-composition $G$ and $\alpha \in \mathbb{Z}$. Now $G(x) = H(x) - \alpha k \equiv H(x) = y \pmod{k}$, so $G \in \mathcal{G}$. Since case (B) does not hold, we have $\operatorname{sgn}(G(x) - y) = \operatorname{sgn}(k)$. Since case (C) does not hold, we have $\alpha \ge 0$, again by Lemma 2(b). If $\alpha = 0$, then $G(x) = H(x) = y$, so $\operatorname{sgn}(G(x) - y) = 0 \ne \operatorname{sgn}(k)$ and case (B) holds, contrary to our assumption. So $\alpha > 0$, and thus $\operatorname{sgn}(G(x) - y) = \operatorname{sgn}(k) = \operatorname{sgn}(\alpha k)$. But this is impossible, since $G(x) - y = H(x) - y - \alpha k = -\alpha k$. So we cannot have $x \xrightarrow{F} y$. □

To test cases (A), (B), and (C), we use the following algorithm.

Lemma 4. Given any $x, y, k \in \mathbb{Z}$ with $k \ne 0$ and a set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and $a_i \ne 0$ for $1 \le i \le N$, put $\mathcal{G} = \{G : x \xrightarrow{F} y \pmod{k} \text{ via } G\}$. There is a 2-EXPTIME algorithm, such that:

(1) If $\mathcal{G} = \emptyset$, the algorithm returns Empty.

(2) If there is some $G \in \mathcal{G}$ with $G = f_{e_0} \circ \cdots \circ f_{e_K}$ and $a_{e_j} < 0$ for some $j$, the algorithm returns Negative.

(3) Otherwise, the algorithm returns $\sup \{G(x) : G \in \mathcal{G}\}$.

The flags Empty and Negative are enough for us to detect cases (A) and (C) of Lemma 3. If $k < 0$, the value $V = \sup \{G(x) : G \in \mathcal{G}\}$ allows us to recognize case (B), because this case holds if and only if $V \ge y$. If $k > 0$ we need the value $V' = \inf \{G(x) : G \in \mathcal{G}\}$ instead, since then case (B) holds if and only if $V' \le y$. The modifications to the algorithm of Lemma 4 required to make it compute $V'$ instead of $V$ are simple and obvious (just exchanging "increases" with "decreases" in several places, etc.), so we omit them. Now we prove Lemma 4 assuming a couple of auxiliary lemmas (Lemmas 5 and 6) which we will return to afterwards.

Proof of Lemma 4. First we check if there is any $F$-composition mapping $x$ to $y \pmod{k}$. Create a directed graph $D$ with a vertex for each congruence class mod $k$. Add edges indicating which classes are mapped to which under each $f_i$. Then there is a path in $D$ from the congruence class of $x$ to the congruence class of $y$ if and only if $x \xrightarrow{F} y \pmod{k}$. Use graph search to determine if there is such a path, and return Empty if not. Since $D$ has $|k|$ vertices, this search takes exponential time.
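This first stage can be sketched as a breadth-first search over residues. The Python below is an illustration only: the name `residue_reachable` and the pair encoding `(a, b)` for $z \mapsto az + b$ are assumptions made for the example, and the edges of $D$ are generated on demand instead of being stored.

```python
from collections import deque


def residue_reachable(x, y, k, funcs):
    """Return True if some composition maps x to a value congruent to y mod k."""
    if k == 0:
        raise ValueError("k must be nonzero")
    modulus = abs(k)
    start, target = x % modulus, y % modulus  # Python's % is nonnegative here
    seen = {start}
    queue = deque([start])
    while queue:
        r = queue.popleft()
        if r == target:
            return True
        for a, b in funcs:
            s = (a * r + b) % modulus  # class reached from r by this function
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return False
```

With the single function $z \mapsto 2z$ and $k = 4$, the classes visited from 1 are 1, 2 and 0, so the class of 6 is reachable but the class of 3 is not.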

If there are paths from $x \pmod{k}$ to $y \pmod{k}$, we need to analyze all of them to see which ones yield the largest final value. We can conveniently describe the paths using regular expressions. Consider $D$ to be a deterministic finite automaton, where an input symbol $e \in \{1, \ldots, N\}$ causes the edge corresponding to applying $f_e$ to be followed. Let the initial state be $x \pmod{k}$, and the only accepting state be $y \pmod{k}$. If $s = e_1 \ldots e_K$ is a sequence of input symbols, we write $P_s = f_{e_K} \circ \cdots \circ f_{e_1}$ (note the order!), and then $D$ accepts $s$ if and only if $P_s(x) \equiv y \pmod{k}$.

Now we convert $D$ into a regular expression $R$ with the same language $L(R)$. We write concatenation multiplicatively, use $|$ to denote union/alternation, and use $\varepsilon$ and $\emptyset$ as the symbols for the empty string and the empty language respectively. Because $D$ has exponentially-many vertices (and a linearly-sized input alphabet), $R$ has at most doubly-exponential size $|R|$, and the conversion from $D$ to $R$ can be done in time at most polynomial in $|R|$ (see [11 [6]). We store $R$ as a tree, with literals, $\varepsilon$, or $\emptyset$ at the leaf nodes and operators at the other nodes. The "length" $|R|$ in this representation is just the total number of nodes.

Next, reduce $R$ to not include the symbol $\emptyset$ by repeatedly passing through $R$ applying the identities $E|\emptyset = E$, $E\emptyset = \emptyset$, and $\emptyset^* = \varepsilon$ for any expression $E$. Each pass takes time linear in the length of $R$ and strictly decreases its length, so there can be at most $|R|$ passes and the total time taken is $O(|R|^2)$. Afterwards, if the symbol $\emptyset$ appears in $R$ it must not be operated on by any operator, since otherwise one of the identities above would apply. Therefore $\emptyset$ can appear in $R$ only if $R = \emptyset$, but since $L(R)$ is nonempty (we would have returned Empty above otherwise) this is not the case. So $R$ does not contain the symbol $\emptyset$.
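A minimal sketch of this simplification in Python is given below. It is not the procedure of the text verbatim: instead of repeated passes it applies the same identities (and their mirror images) once, bottom-up, which reaches the same result. The tuple encoding of expression trees and the name `strip_empty` are assumptions made for the example.

```python
# Expression trees (an encoding assumed here for illustration):
#   ("lit", e), ("eps",), ("empty",), ("cat", A, B), ("alt", A, B), ("star", A)
EMPTY = ("empty",)
EPS = ("eps",)


def strip_empty(expr):
    """Remove the empty-language symbol using E|0 = E, E0 = 0 and 0* = eps."""
    tag = expr[0]
    if tag in ("lit", "eps", "empty"):
        return expr
    if tag == "star":
        inner = strip_empty(expr[1])
        return EPS if inner == EMPTY else ("star", inner)
    left = strip_empty(expr[1])
    right = strip_empty(expr[2])
    if tag == "alt":
        if left == EMPTY:
            return right
        if right == EMPTY:
            return left
        return ("alt", left, right)
    if tag == "cat":
        if left == EMPTY or right == EMPTY:
            return EMPTY
        return ("cat", left, right)
    raise ValueError("unknown node type: %r" % (tag,))
```

After the call, the result is either the single node `("empty",)` or a tree that does not contain that node at all, matching the dichotomy used in the proof.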

Now if $R$ contains a literal corresponding to a function $f_i$ with $a_i < 0$ (which can obviously be determined in $O(|R|)$ time), then $x \xrightarrow{F} y \pmod{k}$ via an $F$-composition which includes $f_i$, so we return Negative. Otherwise, we convert $R$ into disjunctive normal form $S_1 | S_2 | \cdots | S_M$ where each $S_i$ has no union operations, by iteratively applying the identities $(\alpha|\beta)^* = (\alpha^* \beta^*)^*$, $\alpha(\beta|\gamma) = \alpha\beta|\alpha\gamma$, and $(\alpha|\beta)\gamma = \alpha\gamma|\beta\gamma$. Each identity either decreases the number of unions or moves one closer to the topmost level, so this process will also take time polynomial in $|R|$.

We say that a regular expression $E$ is reduced if it contains only literals appearing in $R$ and has no $\emptyset$ symbols or union operations. The reductions we have done above ensure that any expression produced by concatenating subexpressions of the clauses $S_i$ is reduced. Since every literal in $R$ corresponds to a function $f_i$ with $a_i > 0$ (because we would have returned Negative otherwise), and the composition of two linear polynomials with positive linear coefficients has a positive linear coefficient, for any reduced expression $E$ and any $s \in L(E)$, $P_s$ has a positive linear coefficient; this will be important in a moment. We will want to refer to those $F$-compositions which are generated by reduced expressions, and to decrease the proliferation of symbols, we will say that an $F$-composition $G$ matches the expression $E$ if there is some $s \in L(E)$ such that $G = P_s$. Then $\mathcal{G}$ consists precisely of those $F$-compositions which match $R$.

Now, given some $z \in \mathbb{Z}$ and a reduced expression $E$, define $I(z, E)$ to mean that $\exists s \in L(E) : P_s(z) > z$. In words, $I(z, E)$ is true if and only if there is some $F$-composition matching $E$ which increases $z$. If $L(E)$ is finite, computing $I(z, E)$ is only a matter of testing various cases; the difficulty is handling expressions with stars. Fortunately, we can reduce $I$ to its values on expressions with fewer stars using the identity $I(z, \ell\alpha^*\beta) \iff I(P_\ell(z), \alpha) \lor I(z, \ell\beta)$, which we will prove in Lemma 5. This then allows us to compute $I(z, E)$ recursively in polynomial time, as we will show in Lemma 6.

Now we are ready to return to the main problem. For each clause $S_i$, we want to find the supremum $V_i$ of the possible values $x$ is mapped to by any $F$-composition matching $S_i$. To do this, we keep track of the supremum of the values $x$ is mapped to by $F$-compositions which match progressively-longer prefixes of $S_i$. Write $S_i = T_1 \ldots T_K$ by flattening out concatenations, so that each $T_j$ is either a literal or a starred subexpression. Let $x_j^{(i)} = \sup \{P_s(x) : s \in L(T_1 \ldots T_j)\}$ for $1 \le j \le K$. Clearly $V_i = x_K^{(i)}$, and we put $x_0^{(i)} = x$ (since the largest possible value reachable after applying no functions is the starting value $x$). For $j \ge 1$, we calculate $x_j^{(i)}$ in terms of $x_{j-1}^{(i)}$ as follows. If $T_j$ is a literal, then any $F$-composition matching $T_1 \ldots T_j$ must be of the form $P_{T_j} \circ q$ where $q$ is an $F$-composition matching $T_1 \ldots T_{j-1}$. Since by definition the largest possible value of $q(x)$ is $x_{j-1}^{(i)}$, and $P_{T_j}$ has a positive linear coefficient, the largest possible value of $(P_{T_j} \circ q)(x)$ is $P_{T_j}(x_{j-1}^{(i)})$. Thus $x_j^{(i)} = P_{T_j}(x_{j-1}^{(i)})$. If $T_j$ is a starred subexpression instead, $T_j = \alpha^*$, we compute $I(x_{j-1}^{(i)}, \alpha)$. If this is true, then some $F$-composition $p$ matching $\alpha$ increases $x_{j-1}^{(i)}$, and because $p$ has a positive linear coefficient it must increase all values larger than $x_{j-1}^{(i)}$. So we can increase $x_{j-1}^{(i)}$ as much as we want by repeatedly applying $p$, and thus $x_j^{(i)} = \infty$. If $I(x_{j-1}^{(i)}, \alpha)$ is false, then no $F$-composition matching $\alpha$ increases $x_{j-1}^{(i)}$, so $x_j^{(i)} = x_{j-1}^{(i)}$ (since $\alpha^*$ is matched by $P_\varepsilon$, which leaves $x_{j-1}^{(i)}$ fixed). Thus we can iteratively compute $V_i$, beginning with $x_0^{(i)} = x$ and proceeding through $x_K^{(i)} = V_i$. There are $O(|S_i|)$ intermediate values $x_j^{(i)}$ which need to be computed, and each one requires at most one call to $I$ on an expression of size at most $O(|S_i|)$. Thus we can compute each $V_i$ in time polynomial in $|S_i|$, and all the values $V_i$ together in time polynomial in $|R|$.

Now put $V = \max V_i$. Because the union of the languages of each clause $S_i$ is the language of $R$, $V$ is the largest value reachable using $F$-compositions matching $R$ (or $\infty$ if arbitrarily large values are reachable). Therefore $V = \sup \{G(x) : G \in \mathcal{G}\}$, the desired value, and we return it.

As mentioned above, the first stage of this algorithm takes exponential time, and all subsequent stages take time at most polynomial in $|R|$. Since $|R|$ is at most doubly-exponential in the input length, the entire algorithm takes at most doubly-exponential time. □
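The per-clause computation of $V_i$ in this proof can be sketched as a single left-to-right scan. In the Python below, a clause is a list whose items are either a literal (an index into `funcs`) or a pair `("star", alpha)`; this encoding, the name `clause_supremum`, and the parameter `increases` (a caller-supplied procedure deciding $I(z, \alpha)$) are all assumptions made for the illustration. The sketch also assumes, as in the proof, that every coefficient is positive, so an infinite supremum stays infinite.

```python
import math


def clause_supremum(x, clause, funcs, increases):
    """Supremum of the values x is mapped to by compositions matching a clause.

    funcs[e] = (a, b) encodes z -> a*z + b with a > 0.
    increases(z, alpha) must decide I(z, alpha) for a star body alpha.
    """
    current = x  # x_0 = x
    for item in clause:
        if current == math.inf:
            break  # positive coefficients keep an unbounded supremum unbounded
        if isinstance(item, tuple):  # starred subexpression ("star", alpha)
            if increases(current, item[1]):
                current = math.inf
            # otherwise the supremum is unchanged (alpha* matches the empty word)
        else:  # literal: apply the corresponding function to the supremum
            a, b = funcs[item]
            current = a * current + b
    return current
```

For instance, with $f_0(z) = z - 1$ and $f_1(z) = 2z$, the clause $f_0\, f_1^*\, f_0$ (read left to right in application order) gives $\infty$ from $x = 3$, because doubling increases 2, but gives $-1$ from $x = 1$, because doubling does not increase 0.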

Lemma 4 depends on our ability to efficiently calculate $I(z, E)$. As was mentioned above, the key to doing this is to reduce $I(z, E)$ to values of $I$ on smaller subexpressions, as made possible by the following lemma.

Lemma 5. If $z \in \mathbb{Z}$, $\alpha$ and $\beta$ are reduced regular expressions, and $\ell$ is a sequence of literals, the following are true:

(1) $I(z, \ell\alpha^*\beta) \iff I(z, \ell\beta) \lor I(P_\ell(z), \alpha)$

(2) $I(z, \ell\alpha^*) \iff I(z, \ell) \lor I(P_\ell(z), \alpha)$

(3) $I(z, \alpha^*\beta) \iff I(z, \alpha) \lor I(z, \beta)$

(4) $I(z, \alpha^*) \iff I(z, \alpha)$

Proof. (1) $I(z, \ell\beta)$ implies $I(z, \ell\alpha^*\beta)$ because every string in $L(\ell\beta)$ is also in $L(\ell\alpha^*\beta)$. If $I(P_\ell(z), \alpha)$ holds, then there is an $F$-composition $p$ matching $\alpha$ which increases $P_\ell(z)$. Because $p$ has a positive linear coefficient as observed above (since $\alpha$ is reduced), $p$ must increase anything greater than $P_\ell(z)$, and thus repeated applications of $p$ can increase $P_\ell(z)$ as much as desired. Therefore for any particular $F$-composition $q$ matching $\beta$, there is some $n \in \mathbb{N}$ such that $q \circ p^n \circ P_\ell$ increases $z$ (since $\beta$ is reduced and so $q$ also has a positive linear coefficient). Since $q \circ p^n \circ P_\ell$ matches $\ell\alpha^*\beta$, $I(z, \ell\alpha^*\beta)$ holds.

Suppose neither $I(z, \ell\beta)$ nor $I(P_\ell(z), \alpha)$ hold. Any $F$-composition matching $\ell\alpha^*\beta$ is of the form $q \circ p \circ P_\ell$ where $q$ matches $\beta$ and $p$ is a composition of $F$-compositions matching $\alpha$. Since by our assumption no polynomials matching $\alpha$ increase $P_\ell(z)$, and these have positive linear coefficients, they cannot increase anything smaller than $P_\ell(z)$. Thus no composition of $F$-compositions matching $\alpha$ increases $P_\ell(z)$, and so $p$ does not increase $P_\ell(z)$. Now, since $q \circ P_\ell$ does not increase $z$ by assumption (since it matches $\ell\beta$), and $q$ has a positive linear coefficient, $q \circ p \circ P_\ell$ does not increase $z$ either. Thus no $F$-composition matching $\ell\alpha^*\beta$ increases $z$, and $I(z, \ell\alpha^*\beta)$ does not hold.

(2) Put $\beta = \varepsilon$ in (1) and use $L(\ell\alpha^*\varepsilon) = L(\ell\alpha^*)$ and $L(\ell\varepsilon) = L(\ell)$.

(3) Put $\ell = \varepsilon$ in (1) and use $L(\varepsilon\beta) = L(\beta)$ and $P_\varepsilon(z) = z$.

(4) Put $\beta = \varepsilon$ in (3), use $L(\alpha^*\varepsilon) = L(\alpha^*)$, and note that $I(z, \varepsilon)$ is clearly false. □

These relationships allow us to give a straightforward recursive algorithm to compute $I$.

Lemma 6. There is a P algorithm to compute $I(z, E)$ for any $z \in \mathbb{Z}$ and reduced regular expression $E$.

Proof. First, note that if an expression $\alpha$ is a sequence of literals, then only $P_\alpha$ matches it, and we may determine $I(z, \alpha) = (P_\alpha(z) > z)$ directly by evaluating $P_\alpha(z)$. Now we compute $I(z, E)$ recursively, breaking into cases based on the topmost operator or symbol of $E$:

• $E = \varepsilon$: $I(z, E)$ is clearly false.

• $E$ is a literal: As noted above, $I(z, E) = (P_E(z) > z)$ may be directly calculated.

• $E = F^*$: By part (4) of Lemma 5, $I(z, E) = I(z, F^*) = I(z, F)$.

• $E = FG$: By flattening as necessary, we may write $E = F_1 F_2 \cdots F_K$ for some $K \in \mathbb{N}$ with $K \ge 2$ and where none of the subexpressions $F_i$ are concatenations. If any of the subexpressions $F_i$ are $\varepsilon$, we simply drop them and renumber appropriately; this obviously leaves $L(E)$ fixed. Thus we may assume that each subexpression $F_i$ is either a starred subexpression or a literal. If each one is a literal, then $E$ is a sequence of literals and as noted above $I(z, E)$ can be computed directly. Otherwise, find the smallest $j$ such that $F_j = \alpha^*$ is a starred subexpression. There are several cases:

  - $j = 1$: Then $E = \alpha^*\beta$ where $\beta = F_2 \cdots F_K$, so by part (3) of Lemma 5 we have $I(z, E) = I(z, \alpha^*\beta) = I(z, \alpha) \lor I(z, \beta)$.

  - $j = K$: Then $E = \ell\alpha^*$ where $\ell = F_1 \cdots F_{K-1}$ is a sequence of literals, so by part (2) of Lemma 5 we have $I(z, E) = I(z, \ell\alpha^*) = I(z, \ell) \lor I(P_\ell(z), \alpha)$.

  - $1 < j < K$: Then $E = \ell\alpha^*\beta$ where $\ell = F_1 \cdots F_{j-1}$ is a sequence of literals and $\beta = F_{j+1} \cdots F_K$, so by part (1) of Lemma 5 we have $I(z, E) = I(z, \ell\alpha^*\beta) = I(z, \ell\beta) \lor I(P_\ell(z), \alpha)$.

In each case $I(z, E)$ is either directly computable, equivalent to $I(x, F)$ for some $x \in \mathbb{Z}$ and $F$ a proper subexpression of $E$, or equivalent to $I(x, F) \lor I(y, G)$ for some $x, y \in \mathbb{Z}$, $F$ a proper subexpression or concatenation of disjoint subexpressions of $E$, and $G$ a subexpression of $E$ disjoint from $F$. Because $I(z, E)$ is always reduced to values of $I$ on strictly shorter expressions, the tree of recursive calls has at most $O(|E|)$ levels. Since $I(z, E)$ is always reduced to values of $I$ on disjoint expressions made up of subexpressions of $E$, each level of the tree can have at most $O(|E|)$ calls. Thus the entire tree has at most $O(|E|^2)$ calls with polynomial computation each, so $I(z, E)$ may be computed in polynomial time. □
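The recursion of this proof can be written compactly if a reduced expression is stored as a flat list whose items are literals (indices into `funcs`) or pairs `("star", body)` with `body` again such a list; the empty list plays the role of $\varepsilon$. This encoding and the names `apply_word` and `increases` are assumptions made for the sketch below. All four parts of Lemma 5 collapse into one branch, because an empty prefix or suffix is handled by the same formula.

```python
def apply_word(z, word, funcs):
    """Evaluate P_word(z): apply the listed functions in reading order."""
    for e in word:
        a, b = funcs[e]
        z = a * z + b
    return z


def increases(z, expr, funcs):
    """Decide I(z, expr): does some composition matching expr increase z?

    Assumes every function in funcs has a positive linear coefficient.
    """
    for j, item in enumerate(expr):
        if isinstance(item, tuple):  # first starred subexpression alpha*
            prefix, alpha, suffix = expr[:j], item[1], expr[j + 1:]
            # Lemma 5(1): I(z, l a* b) <=> I(z, l b) or I(P_l(z), a)
            return (increases(z, prefix + suffix, funcs)
                    or increases(apply_word(z, prefix, funcs), alpha, funcs))
    # No stars left: expr is a sequence of literals (possibly empty).
    return apply_word(z, expr, funcs) > z
```

With $f_0(z) = z - 1$ and $f_1(z) = 2z$, the call `increases(3, [0, ("star", [1]), 0], funcs)` holds (for example $3 \mapsto 2 \mapsto 4 \mapsto 8 \mapsto 7$), whereas starting from 1 the value is stuck at 0 after the first step and the predicate fails.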

We can now give an algorithm to solve the affine reachability problem over $\mathbb{Z}$ in full generality.

Theorem 1. There is a 2-EXPTIME algorithm to decide, given any $x, y \in \mathbb{Z}$ and a finite set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$, whether $x \xrightarrow{F} y$.

Proof. There are several cases:

(1) For some $j$, $a_j = 0$: Clearly, $x \xrightarrow{F} y$ if and only if either $x \xrightarrow{F \setminus \{f_j\}} y$ or $b_j \xrightarrow{F \setminus \{f_j\}} y$; recursively determine each of these and return true if and only if at least one is true.

(2) For some $j$, $a_j = 1$ and $b_j = 0$: Clearly $x \xrightarrow{F} y$ if and only if $x \xrightarrow{F \setminus \{f_j\}} y$ (since $f_j$ is the identity), so determine this recursively and return the result.

(3) For some $j$, $a_j = 1$ and $b_j \ne 0$: Assume for now that $b_j < 0$ (we will handle $b_j > 0$ momentarily). Run the algorithm of Lemma 4 on $F \setminus \{f_j\}$ with $k = b_j$. If it returns Empty, then case (A) of Lemma 3 holds, so return false. If it returns Negative, then case (C) holds, so return true. Otherwise, the algorithm returns $V = \sup \{G(x) : x \xrightarrow{F \setminus \{f_j\}} y \pmod{b_j} \text{ via } G\}$. Since now either case (B) or case (D) of Lemma 3 holds, $x \xrightarrow{F} y$ if and only if $\operatorname{sgn}(V - y) \ne \operatorname{sgn}(b_j) = -1$. So we return true if and only if $V \ge y$. If $b_j$ was in fact positive, we use the variant of the algorithm of Lemma 4 which computes $V' = \inf \{G(x) : x \xrightarrow{F \setminus \{f_j\}} y \pmod{b_j} \text{ via } G\}$. By Lemma 3 again we have $x \xrightarrow{F} y$ if and only if $\operatorname{sgn}(V' - y) \ne \operatorname{sgn}(b_j) = +1$, so we return true if and only if $V' \le y$.

(4) For some $j$, $a_j = -1$, and $|a_i| > 1$ for all $i \ne j$: Use the algorithm of Lemma 1, modified as in the remark to handle $f_j(z) = -z + b_j$.

(5) For some $j, k$ with $j \ne k$, $a_j = a_k = -1$: Define $g = f_j \circ f_k$. Clearly $x \xrightarrow{F} y$ if and only if $x \xrightarrow{F \cup \{g\}} y$. But $g(z) = (f_j \circ f_k)(z) = -(-z + b_k) + b_j = z + (b_j - b_k)$, and since $b_j \ne b_k$ (since $f_j$ and $f_k$ are distinct functions), $g(z) = z + c$ for some $c \in \mathbb{Z}$ with $c \ne 0$. Recursively solve $x \xrightarrow{F \cup \{g\}} y$ using case (3) and return the result.

(6) Otherwise, $|a_i| > 1$ for all $i$: Use the algorithm of Lemma 1.

Cases (4) and (6) invoke the algorithm of Lemma 1 and take exponential time. Case (3) invokes the algorithm of Lemma 4 and takes doubly-exponential time. Case (5) makes a recursive call which always uses case (3), so it also takes doubly-exponential time. Finally, cases (2) and (1) make one and two recursive calls respectively, each with one less affine function. Thus in total the algorithm will make at most exponentially-many recursive calls (this can easily be reduced to linearly-many by improving the handling of case (1), but this does not decrease the worst-case runtime), with at most a doubly-exponential amount of computation each. Therefore the algorithm runs in at most doubly-exponential time. □

3. Affine reachability over N 

We can also consider the affine reachability problem over N. Much of the analysis is the 
same, so we will only write out in full detail the considerations which are new. The main 
difference from the version over Z is that now we cannot apply any functions which would 
yield a negative result. 

Definition. An $F$-composition is valid with respect to its argument if every integer in its orbit for the given argument is nonnegative. Often the argument of the composition will be clear from context, in which case we will simply say that the composition is valid. We write $x \xrightarrow{F}_{+} y$ to indicate that there is a valid $F$-composition $G$ such that $G(x) = y$.

With this definition, the affine reachability problem over $\mathbb{N}$ is to determine, given $x, y \in \mathbb{N}$ and a finite set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$, whether $x \xrightarrow{F}_{+} y$. As before, there are various cases depending on the values of the linear coefficients $a_i$. The case where they all satisfy $|a_i| > 1$ is still simple.

Lemma 7. There is an EXPTIME algorithm to decide, given any $x, y \in \mathbb{N}$ and a finite set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and satisfying $|a_i| > 1$, whether $x \xrightarrow{F}_{+} y$.

Proof. Use the algorithm of Lemma 1, but with the interval $I = [0, R]$ instead of $[-R, R]$. The argument in the proof of Lemma 1 goes through as before, since all preimages of $y$ under valid $F$-compositions must lie in $I$. □

Remark. This algorithm also works with any number of functions of the form $g(z) = z + k$ with $k > 0$, since these strictly increase absolute value on $\mathbb{N} \setminus I$, and so preimages of $y$ under valid compositions including them must lie in $I$.

The algorithm of Lemma 7 cannot handle functions of the form $g(z) = z - k$ with $k > 0$. Fortunately, as before we can reduce problems with this type of function to "modular" problems without them, using the following (much simpler) analog of Lemma 3. We assume for now that all $a_i > 0$, and show how to handle other cases later.

Lemma 8. For any $x, y \in \mathbb{N}$, a set $S$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and $a_i > 0$ for $1 \le i \le N$, and function $g(z) = z - k$ with $k > 0$, put $F = S \cup \{g\}$. Then $x \xrightarrow{F}_{+} y$ if and only if $x \xrightarrow{S}_{+} z \equiv y \pmod{k}$ for some $z \ge y$.

Proof. Suppose $x \xrightarrow{F}_{+} y$ via $G$. By Lemma 2(b), $G = g^{\alpha} \circ H$ with $\alpha \ge 0$ and $H$ being $G$ with all instances of $g$ removed. Since $G(x) = y$, we have $H(x) \equiv y \pmod{k}$. Now note that $g$ always decreases its argument, and since every $a_i$ is positive, each $f_i$ maps larger inputs to larger outputs. Therefore removing an instance of $g$ from $G$ can only increase the values of integers in its orbit. Since $G$ is valid this means $H$ must be as well, and since $G(x) = y$ this means $H(x) \ge y$. Therefore $x \xrightarrow{S}_{+} H(x) \equiv y \pmod{k}$ with $H(x) \ge y$.

Conversely, suppose $x \xrightarrow{S}_{+} z \equiv y \pmod{k}$ via $H$ with $z \ge y$. Then $(g^n \circ H)(x) = z - nk = y$ for some $n \in \mathbb{N}$. Putting $G = g^n \circ H$, $G$ is valid because $H$ is and the remaining values $z - k, z - 2k, \ldots, z - nk = y$ in its orbit are all at least $y \ge 0$, and so we have $x \xrightarrow{F}_{+} y$ via $G$. □

The algorithm of Lemma 4 is almost exactly what we need to test the condition in Lemma 8, since it computes the largest $z \equiv y \pmod{k}$ reachable by $S$-compositions. However, we must consider only those reachable by valid $S$-compositions, and so need to modify the algorithm. This is not hard to do.

Lemma 9. Given any $x, y \in \mathbb{N}$, $k \in \mathbb{Z}$ with $k \ne 0$, and a set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$ and $a_i > 0$ for $1 \le i \le N$, put $\mathcal{G} = \{G : x \xrightarrow{F}_{+} y \pmod{k} \text{ via } G\}$. There is a 2-EXPTIME algorithm which returns Empty if $\mathcal{G} = \emptyset$, and otherwise returns $\sup \{G(x) : G \in \mathcal{G}\}$.

Proof. As in the algorithm of Lemma 4, construct the graph $D$ and search it to determine if $x \xrightarrow{F} y \pmod{k}$ via any $F$-composition, not necessarily a valid one. If not, return Empty. Otherwise, consider $D$ to be a finite automaton as before, and convert it into a reduced regular expression $R$ in disjunctive normal form $R = S_1 | \ldots | S_M$.

Given some $z \in \mathbb{N}$ and $\ell$ a sequence of literals which appear in $R$, we define $V(z, \ell)$ to mean that $P_\ell$ is a valid $F$-composition with respect to $z$. Now for any $z \in \mathbb{N}$ and reduced expression $E$, define $I'(z, E)$ to mean $\exists s \in L(E) : V(z, s) \land (P_s(z) > z)$. In words, this means that there is some $F$-composition valid with respect to $z$ which matches $E$ and increases $z$ (this is just the analog of $I(z, E)$ from Lemma 5, but restricted to only valid compositions). By an extension of Lemma 5 which we will prove momentarily, Lemma 10, we have that $I'(z, \ell\alpha^*\beta) \iff V(z, \ell) \land (I'(P_\ell(z), \alpha) \lor I'(z, \ell\beta))$. Using this we may compute $I'(z, E)$ in polynomial time with the analog of the algorithm of Lemma 6, described shortly in Lemma 11.

Now we continue as in the algorithm of Lemma 4, writing $S_i = T_1 \ldots T_K$ and defining $x_j^{(i)} = \sup \{P_s(x) : s \in L(T_1 \ldots T_j)\}$. We calculate the values $x_j^{(i)}$ in the same way as before, except using $I'$ in place of $I$ when dealing with starred subexpressions $T_j$. This ensures that only $F$-compositions which are valid with respect to $x_{j-1}^{(i)}$ are used to compute $x_j^{(i)}$ when $T_j$ is a starred subexpression. When $T_j$ is a literal, we put $x_j^{(i)} = P_{T_j}(x_{j-1}^{(i)})$ as usual, but also check if $x_j^{(i)} < 0$. If so, then $P_{T_j}$ is not valid with respect to $x_{j-1}^{(i)}$, and thus no $F$-composition matching $S_i$ can be valid with respect to $x$. So we discard $S_i$ and move on. Otherwise again $x_j^{(i)}$ can be obtained from $x_{j-1}^{(i)}$ using a valid $F$-composition. Then if we compute $x_K^{(i)}$ without discarding $S_i$, the value $x_K^{(i)}$ can be obtained from $x$ using a valid $F$-composition, and so $V_i = x_K^{(i)}$ is the supremum of possible values $x$ is mapped to by any valid $F$-composition matching $S_i$.

If we discarded every $S_i$, then no valid $F$-compositions match $R$, so $\mathcal{G} = \emptyset$ and we return Empty. Otherwise, if $V$ is the largest of the values $V_i$ (at least one of these is defined since we did not discard every $S_i$) then $V = \sup \{G(x) : G \in \mathcal{G}\}$ and we return it. As in Lemma 4 this algorithm takes time polynomial in $|R|$, and thus takes at most doubly-exponential time. □

Now we prove the analog of Lemma 5 for $I'$.

Lemma 10. If $z \in \mathbb{N}$, $\alpha$ and $\beta$ are reduced regular expressions, and $\ell$ is a sequence of literals, the following are true:

(1) $I'(z, \ell\alpha^*\beta) \iff V(z, \ell) \land (I'(z, \ell\beta) \lor I'(P_\ell(z), \alpha))$

(2) $I'(z, \ell\alpha^*) \iff V(z, \ell) \land (I'(z, \ell) \lor I'(P_\ell(z), \alpha))$

(3) $I'(z, \alpha^*\beta) \iff I'(z, \alpha) \lor I'(z, \beta)$

(4) $I'(z, \alpha^*) \iff I'(z, \alpha)$

Proof. (1) $I'(z, \ell\beta)$ implies $I'(z, \ell\alpha^*\beta)$ because $\ell\beta$ matches $\ell\alpha^*\beta$. If $I'(P_\ell(z), \alpha)$ holds,
then there is an F-composition $p$ matching $\alpha$ which is valid with respect to, and
increases, $P_\ell(z)$. Because $p$ has a positive linear coefficient (since we assumed all $a_i >
0$), $p$ must increase anything greater than $P_\ell(z)$, and thus repeated applications of $p$
can increase $P_\ell(z)$ as much as desired. Therefore for any particular F-composition
$q$ matching $\beta$, there is some $n \in \mathbb{N}$ such that $q \circ p^n$ is valid with respect to, and
increases, $P_\ell(z)$, and $q \circ p^n \circ P_\ell$ increases $z$. If we also have $V(z, \ell)$, then $P_\ell$ is
valid with respect to $z$, and then $q \circ p^n \circ P_\ell$ is valid with respect to, and increases,
$z$. Then, since $q \circ p^n \circ P_\ell$ matches $\ell\alpha^*\beta$, $I'(z, \ell\alpha^*\beta)$ holds.

Suppose $V(z, \ell)$ does not hold. Then no F-composition $p$ matching $\ell\alpha^*\beta$ can be
valid with respect to $z$, since the first part of $p$ must be $P_\ell$, which is not valid with
respect to $z$ and thus will cause some integers in the orbit of $p$ to be negative. So
$I'(z, \ell\alpha^*\beta)$ does not hold.

Suppose that neither $I'(z, \ell\beta)$ nor $I'(P_\ell(z), \alpha)$ holds. Any valid F-composition
matching $\ell\alpha^*\beta$ is of the form $q \circ p \circ P_\ell$ where $q$ matches $\beta$ and $p$ is a composition of
F-compositions matching $\alpha$. Since by our assumption valid F-compositions matching
$\alpha$ do not increase $P_\ell(z)$, and these have positive linear coefficients, they do not
increase anything smaller than $P_\ell(z)$. Thus no valid composition of F-compositions
matching $\alpha$ increases $P_\ell(z)$, and so $p$ does not increase $P_\ell(z)$. Therefore because $q$
is valid with respect to $(p \circ P_\ell)(z)$, $q \circ P_\ell$ is valid with respect to $z$, and since $I'(z, \ell\beta)$
does not hold, $q \circ P_\ell$ does not increase $z$. Because $q$ has a positive linear coefficient,
$q \circ p \circ P_\ell$ does not increase $z$ either. Thus no valid F-composition matching $\ell\alpha^*\beta$
increases $z$, and $I'(z, \ell\alpha^*\beta)$ does not hold.

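(2) Put $\beta = \varepsilon$ in (1) and use $L(\ell\alpha^*\varepsilon) = L(\ell\alpha^*)$ and $L(\ell\varepsilon) = L(\ell)$.

(3) Put $\ell = \varepsilon$ in (1) and use $L(\varepsilon\beta) = L(\beta)$ and $P_\varepsilon(z) = z$.

(4) Put $\beta = \varepsilon$ in (3), use $L(\alpha^*\varepsilon) = L(\alpha^*)$, and note that $I'(z, \varepsilon)$ is clearly false. □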
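
The monotonicity step used twice in the proof of (1) can be spelled out as follows. This is an added gloss rather than text from the paper: $p(w) = aw + b$ stands for an arbitrary F-composition, and $a \ge 1$ because $a$ is a product of positive integers $a_i$; the names $w$ and $w_0$ are introduced here for illustration.

```latex
p(w) - w = (a-1)\,w + b, \qquad a \ge 1
\;\Longrightarrow\;
p(w) - w \;\ge\; p(w_0) - w_0 \quad \text{whenever } w \ge w_0 .
```

Consequently, if $p(w_0) > w_0$ then $p(w) > w$ for every $w \ge w_0$, and if $p(w_0) \le w_0$ then $p(w) \le w$ for every $w \le w_0$. Taking $w_0 = P_\ell(z)$ gives the two uses in the proof.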

As before, these relationships yield a recursive algorithm for computing I': 

Lemma 11. There is a P algorithm to compute $I'(z, E)$ for any $z \in \mathbb{N}$ and reduced regular
expression $E$.

Proof. The algorithm is identical to that of Lemma [6] except that in addition to reducing
$I'(z, E)$ to values of $I'$ on shorter expressions, the identities in Lemma [10] also require
evaluations of $V(z, \ell)$. Clearly $V(z, \ell)$ can be computed in polynomial time, by simply
calculating the entire orbit of $P_\ell$ and checking that every integer in it is nonnegative. Adding
a single computation of $V$ at each recursive call in the algorithm of Lemma [6] multiplies its
runtime by a polynomial factor, so the overall algorithm still runs in polynomial time. □
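
The validity check described in this proof can be sketched as below. This is a minimal illustration, not the paper's own code: the pair encoding `(a, b)` for the map z -> a*z + b, the function names `orbit` and `is_valid`, and the convention that the first literal in the list is applied first are all assumptions made here.

```python
from typing import List, Tuple

Affine = Tuple[int, int]  # (a, b) represents z -> a*z + b (encoding assumed here)


def orbit(z: int, literals: List[Affine]) -> List[int]:
    """Return z followed by each intermediate value obtained by applying
    the literals one after another (first list entry applied first)."""
    values = [z]
    for a, b in literals:
        values.append(a * values[-1] + b)
    return values


def is_valid(z: int, literals: List[Affine]) -> bool:
    """Sketch of V(z, l): every integer in the orbit is nonnegative."""
    return all(v >= 0 for v in orbit(z, literals))
```

For example, applying z -> 2z - 10 and then z -> z + 20 to z = 3 passes through -4, so the sequence is not valid at 3 even though the final value 16 is nonnegative.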

Now we can give a general algorithm for the affine reachability problem over N. 

Theorem 2. There is a 2-EXPTIME algorithm to decide, given any $x, y \in \mathbb{N}$ and a finite
set $F$ of functions $f_i(z) = a_i z + b_i$ with $a_i, b_i \in \mathbb{Z}$, whether $x \xrightarrow{F}_+ y$.

Proof. There are several cases:

(1) For some $j$, $a_j = 0$: As in Theorem [1], recursively determine $x \xrightarrow{F \setminus \{f_j\}}_+ y$ and
$b_j \xrightarrow{F \setminus \{f_j\}}_+ y$ and return true if and only if at least one holds.

(2) For some $j$, $a_j < 0$: There are only a finite number of $z \in \mathbb{N}$ which $f_j$ can be applied
to without giving a negative result: for example, they are all in $[0, b_j]$. Create a
directed graph $D$ with a vertex for each of these, as well as vertices for $x$ and $y$
if not already present. Add edges indicating how $f_j$ maps these values to each
other. Use this algorithm recursively on $S = F \setminus \{f_j\}$ to add edges corresponding
to mappings by all possible valid $S$-compositions. Then there is a path in $D$ from
$x$ to $y$ if and only if $x \xrightarrow{F}_+ y$. Use graph search to test if such a path exists, and
return the result.

(3) For some $j$, $a_j = 1$ and $b_j = 0$: As in Theorem [1], recursively solve $x \xrightarrow{F \setminus \{f_j\}}_+ y$ and
return the result.

(4) For some $j$, $a_j = 1$ and $b_j < 0$: Run the algorithm in Lemma [9] on $F \setminus \{f_j\}$
with $k = b_j$. If it returns Empty, then by Lemma [8] we cannot have $x \xrightarrow{F}_+ y$
and we return false. Otherwise the algorithm returns $V = \sup \{G(x) : x \to_+ y
\pmod{k} \text{ via } G\}$, and again by Lemma [8] we have $x \xrightarrow{F}_+ y$ if and only if $V \ge y$. So
return true if and only if $V \ge y$.

(5) Otherwise, for all $i$ we have $a_i \ge 1$, and if $a_i = 1$ then $b_i > 0$: Use the algorithm of
Lemma [7] (which works in this case as noted in the remark), and return the result.

The analysis of the runtime is similar to that given in Theorem [1] except for case (2). The
graph $D$ created in that case has exponentially many vertices (linear in the value of $b_j$), so
each invocation of the algorithm makes at most exponentially many recursive calls. There
are at most a linear number of levels, since each recursive call has one less affine function
than its parent. Thus there are exponentially many recursive calls in total. The work done
in each call takes at most doubly-exponential time (since the algorithm in Lemma [9] can take
this long), so the overall running time of the algorithm is at most doubly-exponential. □
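
The case analysis in this proof can be illustrated with a small dispatcher, together with the part of the graph D in case (2) that does not depend on recursion. This is a hedged sketch under stated assumptions: the `(a, b)` encoding, the names `classify_case` and `case2_graph`, and the choice to test the cases in the order (1) to (4) are introduced here for illustration; the proof does not fix a priority among cases. Following the remark that all relevant values lie in [0, b_j], the sketch uses that whole interval as the vertex set.

```python
from typing import Dict, List, Set, Tuple

Affine = Tuple[int, int]  # (a, b) represents z -> a*z + b (encoding assumed here)


def classify_case(F: List[Affine]) -> Tuple[int, int]:
    """Return (case number, index j of the triggering function).
    Case (5) has no triggering function, so j is -1 there."""
    tests = [
        (1, lambda a, b: a == 0),
        (2, lambda a, b: a < 0),
        (3, lambda a, b: a == 1 and b == 0),
        (4, lambda a, b: a == 1 and b < 0),
    ]
    for case, test in tests:
        for j, (a, b) in enumerate(F):
            if test(a, b):
                return case, j
    return 5, -1


def case2_graph(fj: Affine, x: int, y: int) -> Tuple[Set[int], Dict[int, int]]:
    """Vertices and f_j-edges of D in case (2), where a_j < 0.
    Edges coming from the recursive call on F \\ {f_j} are NOT added here."""
    a, b = fj
    assert a < 0
    # Vertex set: the interval [0, b_j] (empty if b_j < 0), plus x and y.
    vertices = set(range(0, b + 1)) | {x, y}
    # f_j may be applied to z only if the result stays nonnegative.
    edges = {z: a * z + b for z in range(0, b + 1) if a * z + b >= 0}
    return vertices, edges
```

For instance, with f_j(z) = -2z + 5 the only usable inputs are 0, 1 and 2, which map to 5, 3 and 1 respectively. The number of vertices is linear in the value of b_j, hence exponential in its bit length, matching the runtime remark in the proof.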

4. A Lower Bound 

While it may be hoped that there are vastly more efficient algorithms for the affine 
reachability problems than the 2-EXPTIME methods we have given here, the following 
theorem shows that polynomial-time algorithms are unlikely. 

Theorem 3. The affine reachability problems over $\mathbb{Z}$ and $\mathbb{N}$ are NP-hard.

Proof. We give a reduction from the Integer Knapsack Problem (IKP), which is to determine,
given $w_1, \ldots, w_N, C \in \mathbb{N}$, whether there are $x_1, \ldots, x_N \in \mathbb{N}$ such that $\sum_i w_i x_i =
C$. This problem is known to be NP-complete [?]. For a given instance of the IKP,
$w_1, \ldots, w_N, C \in \mathbb{N}$, let the set $F$ consist of the affine functions $f_i(z) = z + w_i$ for $1 \le i \le N$.
If there exist $x_1, \ldots, x_N \in \mathbb{N}$ such that $\sum_i w_i x_i = C$, then $(f_1^{x_1} \circ \cdots \circ f_N^{x_N})(0) = \sum_i w_i x_i =
C$, so $0 \xrightarrow{F} C$. Since the functions $f_i$ all commute, if $0 \xrightarrow{F} C$ then $C = (f_1^{x_1} \circ \cdots \circ f_N^{x_N})(0) =
\sum_i w_i x_i$ for some $x_1, \ldots, x_N \in \mathbb{N}$. Thus $0 \xrightarrow{F} C$ if and only if the IKP instance is solvable.
Computing $F$ given an IKP instance can obviously be done in polynomial time, so this
gives a polynomial-time many-one reduction from IKP to the affine reachability problem
over $\mathbb{Z}$, showing that the latter is NP-hard. The reduction to affine reachability over $\mathbb{N}$ is
exactly the same, since all F-compositions are valid. □
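
The reduction in this proof is simple enough to write down directly. The sketch below builds the reachability instance from an IKP instance and, purely for checking small examples, decides it by brute force. The names `ikp_to_reachability` and `reachable_bounded` and the `(a, b)` encoding are assumptions made here, and the search is not one of the paper's algorithms: it is correct only because every function has the form z -> z + w with w >= 0, so values above the target can be pruned, and it takes time proportional to the value of C rather than to its bit length.

```python
from collections import deque
from typing import List, Tuple

Affine = Tuple[int, int]  # (a, b) represents z -> a*z + b (encoding assumed here)


def ikp_to_reachability(weights: List[int], capacity: int) -> Tuple[int, int, List[Affine]]:
    """Map an IKP instance to (x, y, F) with x = 0, y = C and f_i(z) = z + w_i."""
    return 0, capacity, [(1, w) for w in weights]


def reachable_bounded(x: int, y: int, F: List[Affine]) -> bool:
    """Breadth-first search over values <= y; valid only for translations by w >= 0."""
    assert all(a == 1 and b >= 0 for a, b in F)
    seen = {x}
    queue = deque([x])
    while queue:
        z = queue.popleft()
        if z == y:
            return True
        for a, b in F:
            nxt = a * z + b
            # Values never decrease, so anything above y cannot lead back to y.
            if nxt <= y and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

With weights 3 and 5, the target 11 = 3 + 3 + 5 is reachable from 0, while 7 is not, mirroring the solvability of the corresponding knapsack instances.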

5. Conclusion 

We gave 2-EXPTIME algorithms for the affine reachability problems over $\mathbb{Z}$ and $\mathbb{N}$, and
showed that they are NP-hard. Beyond improving these upper and lower bounds, a natural
generalization would be to allow integer or integer-valued polynomials
instead of just affine functions. Also, the original problem of which this
paper treated a special case, namely reachability for affine evolution over $\mathbb{Q}$, remains
open. This provides another clear direction for future work.